Now that the 2020 election is officially over and Biden has been elected President of the United States, it is important that I reflect on my prediction model. I am excited to see what I can learn from it for the models I create in the future.
Let’s first recap my prediction model to get a better picture of what it was.
My prediction model was an ensemble model that predicted the popular vote share for each state.
Given that the Time For Change Model was an inspiration, I decided to focus my model on historical Republican vote share, as Trump was the incumbent for the 2020 election and incumbency was one predictor in the Time For Change Model.
I decided to separate America into three categories - red states, blue states, and battleground states - for my model to adjust for overfitting. The groupings were based on how FiveThirtyEight grouped states.
My model used the following data:
In my model, I decided to classify approval, Q2 GDP growth, and turnout as fundamentals.
Thus, my ensemble model weighted the poll model (using only polls) by 0.96 and the fundamental model (using only fundamentals) by 0.04. I chose these weights based on FiveThirtyEight’s reasoning that polls become better predictors as the election nears, whereas fundamentals are noisier.
Trump vote share = 0.96 * Poll + 0.04 * Fundamental
My final prediction using the ensemble model was that Biden would win 310 electoral votes while Trump would win 228, meaning Biden would become president-elect of the United States.
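To make the weighting concrete, here is a minimal Python sketch of the combination described above. Only the weights 0.96 and 0.04 come from my model. The function name and the example inputs (48.0 and 52.0) are made up for illustration and are not actual outputs of my poll or fundamental models.

```python
# Weights from the post: polls dominate, fundamentals get a small share.
POLL_WEIGHT = 0.96
FUNDAMENTAL_WEIGHT = 0.04


def ensemble_trump_share(poll_pred: float, fundamental_pred: float) -> float:
    """Combine the two sub-model predictions into one Trump vote share."""
    return POLL_WEIGHT * poll_pred + FUNDAMENTAL_WEIGHT * fundamental_pred


# Hypothetical state: the poll model says 48.0, the fundamental model says 52.0.
example_share = ensemble_trump_share(48.0, 52.0)
print(round(example_share, 2))
```

Because the poll weight is so large, the ensemble prediction stays very close to the poll model's prediction even when the two sub-models disagree by several points.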
Overall, I am pretty satisfied with how my model turned out. While I did miss a few states, I was quite happy that I predicted some battleground states correctly.
Above is a comparison between my predictions and the actual results of the 2020 election. As you can see, the states that I got wrong were battleground states. However, I would like to say that the predictive intervals for the battleground states did capture the true result.
Moreover, let’s take a look at the plot above, which plots the actual two-party vote share for Trump against my predictions for Trump. The blue points represent states Biden won and the red points represent states Trump won.
Furthermore, the map above shows the difference between Trump’s actual and predicted two-party vote share in each state. A negative difference means that Trump was overpredicted for that particular state, while a positive difference means that Trump was underpredicted. It is interesting that the states where Trump was greatly overpredicted or greatly underpredicted are not battleground states. This makes sense because states that are traditionally red or blue typically have less polling, as there is only a small chance that those states will flip. This may be why, in a state like Alaska with little polling, Trump was greatly overpredicted.
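The sign convention for the map can be sketched in a few lines of Python. The difference is actual minus predicted, as described above. The function names and the numbers in the example are hypothetical and are not taken from my results.

```python
def prediction_difference(actual: float, predicted: float) -> float:
    """Actual minus predicted two-party vote share for Trump."""
    return actual - predicted


def describe_difference(difference: float) -> str:
    """Translate the sign of the difference into words."""
    if difference < 0:
        return "overpredicted"   # the model gave Trump more than he received
    if difference > 0:
        return "underpredicted"  # the model gave Trump less than he received
    return "exact"


# Hypothetical state: the model predicted 58.0 but Trump received 55.0.
diff = prediction_difference(actual=55.0, predicted=58.0)
print(diff, describe_difference(diff))
```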
The histogram above shows the error distribution for my prediction model. The errors appear to be roughly normally distributed and centered around 0.
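One rough way to check a claim like "centered around 0" is to compute the mean error alongside the root mean squared error (RMSE). The sketch below uses a short list of made-up errors purely for illustration. These are not my model's actual state-level errors, and the function name is my own.

```python
import math
from statistics import mean


def summarize_errors(errors: list[float]) -> tuple[float, float]:
    """Return (mean error, RMSE) for a list of actual-minus-predicted errors."""
    mean_error = mean(errors)
    rmse = math.sqrt(mean(e ** 2 for e in errors))
    return mean_error, rmse


# Hypothetical errors in percentage points, one per state.
hypothetical_errors = [-2.0, 1.0, 0.5, -0.5, 1.0]
mean_error, rmse = summarize_errors(hypothetical_errors)
print(mean_error, rmse)
```

A mean error near 0 suggests the model did not lean systematically toward either candidate, while the RMSE shows how large a typical miss was regardless of direction.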